Overview

Dataset Statistics

Number of Variables 10
Number of Rows 532
Missing Cells 0
Missing Cells (%) 0.0%
Duplicate Rows 0
Duplicate Rows (%) 0.0%
Total Size in Memory 83.9 KB
Average Row Size in Memory 161.4 B
Variable Types
  • Categorical: 3
  • Numerical: 7

Dataset Insights

Patent_count is skewed
Turnover_lay is skewed
Turnover_2012 is skewed
Total_assets_2012 is skewed
Employees_2012 is skewed
R&D_2012 is skewed
Country_code is skewed
ID has a high cardinality: 469 distinct values
Patent_industry has constant length 1
University has constant length 1

Variables

ID

categorical

Approximate Distinct Count 469
Approximate Unique (%) 88.2%
Missing 0
Missing (%) 0.0%
Memory Size 50.5 KB

Length

Mean 31.3647
Standard Deviation 16.7708
Median 27
Minimum 4
Maximum 124

Sample

1st row Dowa Electronics M...
2nd row Japan Science and ...
3rd row Otsuka Chemical Co...
4th row JSR CORPORATION
5th row Central Glass Co. ...

Letter

Count 14517
Lowercase Letter 10829
Space Separator 1739
Uppercase Letter 3688
Dash Punctuation 37
Decimal Number 8
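
The "Approximate Unique (%)" figure above appears to be the distinct count divided by the number of rows, and the letter "Count" appears to be the sum of lowercase and uppercase letters. The sketch below checks both against the reported numbers. The helper name unique_pct is hypothetical and is not part of the profiling tool.

```python
def unique_pct(distinct: int, rows: int) -> float:
    """Distinct values as a percentage of all rows, rounded to one decimal."""
    return round(100 * distinct / rows, 1)


ROWS = 532  # "Number of Rows" from the Overview

# ID: 469 distinct values out of 532 rows
print(unique_pct(469, ROWS))

# Letter count for ID: lowercase letters plus uppercase letters
letter_count = 10829 + 3688
print(letter_count)
```

The same helper can be applied to the distinct counts of the other variables, for example unique_pct(5, ROWS) for Patent_industry.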

Patent_industry

categorical

Approximate Distinct Count 5
Approximate Unique (%) 0.9%
Missing 0
Missing (%) 0.0%
Memory Size 34.3 KB

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row 1
2nd row 1
3rd row 1
4th row 1
5th row 1

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 532
  • The top 2 categories (4, 3) account for over 50.0% of rows
  • Patent_industry has words of constant length

University

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.4%
Missing 0
Missing (%) 0.0%
Memory Size 34.3 KB

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row 0
2nd row 0
3rd row 0
4th row 0
5th row 0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 532
  • The top 2 categories (0, 1) account for over 50.0% of rows
  • The most frequent value (0) occurs over 3.03 times as often as the second most frequent value (1)
  • University has words of constant length

Patent_count

numerical

Approximate Distinct Count 209
Approximate Unique (%) 39.3%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 8.3 KB
Mean 2864.0113
Minimum 1
Maximum 204120
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • Patent_count is skewed right (γ1 = 9.1975)

Quantile Statistics

Minimum 1
5-th Percentile 1
Q1 5
Median 36
Q3 406.25
95-th Percentile 7311.7
Maximum 204120
Range 204119
IQR 401.25

Descriptive Statistics

Mean 2864.0113
Standard Deviation 17736.6762
Variance 3.1459e+08
Sum 1.5237e+06
Skewness 9.1975
Kurtosis 89.7817
Coefficient of Variation 6.1929
  • Patent_count is not normally distributed (p-value 4.581410075476347e-25)
  • Patent_count has 84 outliers
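
Several of the Patent_count figures above are derived from others: Range from Minimum and Maximum, IQR from Q1 and Q3, Variance from Standard Deviation, Sum from Mean and row count, and Coefficient of Variation from Standard Deviation and Mean. The sketch below recomputes them from the rounded values printed in this report, so small differences in the last digit are possible. The variable names are illustrative only.

```python
# Figures copied from the Patent_count section of this report.
ROWS = 532
MINIMUM, MAXIMUM = 1, 204120
Q1, Q3 = 5, 406.25
MEAN, STD = 2864.0113, 17736.6762

value_range = MAXIMUM - MINIMUM  # Range
iqr = Q3 - Q1                    # IQR (interquartile range)
variance = STD ** 2              # Variance
total = MEAN * ROWS              # Sum
cv = STD / MEAN                  # Coefficient of Variation

print(value_range, iqr)
print(f"{variance:.4e} {total:.4e} {cv:.4f}")
```

The same relationships can be checked for the other numerical variables by substituting their reported values.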

Turnover_lay

numerical

Approximate Distinct Count 235
Approximate Unique (%) 44.2%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 8.3 KB
Mean 2.1067e+07
Minimum 0
Maximum 2.7667e+08
Zeros 1
Zeros (%) 0.2%
Negatives 0
Negatives (%) 0.0%
  • Turnover_lay is skewed right (γ1 = 4.4342)

Quantile Statistics

Minimum 0
5-th Percentile 64646.6
Q1 750000
Median 1.1976e+07
Q3 2.1067e+07
95-th Percentile 8.0459e+07
Maximum 2.7667e+08
Range 2.7667e+08
IQR 2.0317e+07

Descriptive Statistics

Mean 2.1067e+07
Standard Deviation 4.0387e+07
Variance 1.6311e+15
Sum 1.1207e+10
Skewness 4.4342
Kurtosis 22.7175
Coefficient of Variation 1.9171
  • Turnover_lay is not normally distributed (p-value 4.345040763358361e-19)
  • Turnover_lay has 43 outliers
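
The γ1 value reported as Skewness is conventionally the third standardized moment. The sketch below is a minimal implementation of that moment-based definition using population moments. Whether the profiling tool applies a bias correction is not stated in this report, so treat this as an assumption. The sample lists are hypothetical and are not taken from the dataset.

```python
def skewness(values):
    """Moment-based skewness: m3 / m2**1.5, using population moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5


symmetric = [1, 2, 3, 4, 5]      # hypothetical, evenly spread
right_tailed = [1, 1, 1, 1, 10]  # hypothetical, one large value

print(skewness(symmetric))
print(skewness(right_tailed))
```

A positive result indicates a longer right tail, which matches the "skewed right" wording used for the variables in this report.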

Turnover_2012

numerical

Approximate Distinct Count 159
Approximate Unique (%) 29.9%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 8.3 KB
Mean 3.1064e+07
Minimum 0
Maximum 3.7712e+08
Zeros 2
Zeros (%) 0.4%
Negatives 0
Negatives (%) 0.0%
  • Turnover_2012 is skewed right (γ1 = 4.5453)

Quantile Statistics

Minimum 0
5-th Percentile 228333.9
Q1 1.1609e+07
Median 3.1064e+07
Q3 3.1064e+07
95-th Percentile 7.8899e+07
Maximum 3.7712e+08
Range 3.7712e+08
IQR 1.9455e+07

Descriptive Statistics

Mean 3.1064e+07
Standard Deviation 3.5993e+07
Variance 1.2955e+15
Sum 1.6526e+10
Skewness 4.5453
Kurtosis 28.4896
Coefficient of Variation 1.1587
  • Turnover_2012 is not normally distributed (p-value 1.0735925034708026e-22)
  • Turnover_2012 has 40 outliers

Total_assets_2012

numerical

Approximate Distinct Count 161
Approximate Unique (%) 30.3%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 8.3 KB
Mean 4.4125e+07
Minimum 923
Maximum 6.85e+08
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • Total_assets_2012 is skewed right (γ1 = 5.4604)

Quantile Statistics

Minimum 923
5-th Percentile 259570.55
Q1 1.4328e+07
Median 4.4125e+07
Q3 4.4125e+07
95-th Percentile 1.1608e+08
Maximum 6.85e+08
Range 6.85e+08
IQR 2.9797e+07

Descriptive Statistics

Mean 4.4125e+07
Standard Deviation 5.8261e+07
Variance 3.3943e+15
Sum 2.3474e+10
Skewness 5.4604
Kurtosis 40.6962
Coefficient of Variation 1.3204
  • Total_assets_2012 is not normally distributed (p-value 3.6247602781160884e-22)
  • Total_assets_2012 has 36 outliers

Employees_2012

numerical

Approximate Distinct Count 132
Approximate Unique (%) 24.8%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 8.3 KB
Mean 75574.8011
Minimum 12
Maximum 434246
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • Employees_2012 is skewed right (γ1 = 2.7726)

Quantile Statistics

Minimum 12
5-th Percentile 828
Q1 75574.8011
Median 75574.8011
Q3 75574.8011
95-th Percentile 193321.5
Maximum 434246
Range 434234
IQR 0

Descriptive Statistics

Mean 75574.8011
Standard Deviation 61877.9373
Variance 3.8289e+09
Sum 4.0206e+07
Skewness 2.7726
Kurtosis 10.3806
Coefficient of Variation 0.8188
  • Employees_2012 is not normally distributed (p-value 2.5675475329033726e-24)
  • Employees_2012 has 176 outliers
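
Employees_2012 reports Q1, Median and Q3 all equal to the Mean, an IQR of 0, and 176 outliers. The report does not state its outlier rule. Assuming the common 1.5 × IQR fence, a zero IQR collapses both fences onto the quartile value, so every row that differs from it is flagged. The sketch below illustrates this with hypothetical data in which most rows share one value. The function name iqr_outliers is illustrative only.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR] (assumed rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [x for x in values if x < lower or x > upper]


# Hypothetical column: six of ten rows share the same value.
column = [100, 500, 1000, 1000, 1000, 1000, 1000, 1000, 5000, 9000]

flagged = iqr_outliers(column)
print(flagged)
```

Under that assumption, the outlier count for a column like this mostly reflects how many rows differ from the repeated value, not how many are extreme.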

R&D_2012

numerical

Approximate Distinct Count 118
Approximate Unique (%) 22.2%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 8.3 KB
Mean 1.6777e+06
Minimum 1211
Maximum 1.0772e+07
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • R&D_2012 is skewed right (γ1 = 3.3229)

Quantile Statistics

Minimum 1211
5-th Percentile 31061.9
Q1 1.6777e+06
Median 1.6777e+06
Q3 1.6777e+06
95-th Percentile 3.6252e+06
Maximum 1.0772e+07
Range 1.0771e+07
IQR 0

Descriptive Statistics

Mean 1.6777e+06
Standard Deviation 1.3889e+06
Variance 1.9289e+12
Sum 8.9255e+08
Skewness 3.3229
Kurtosis 15.1889
Coefficient of Variation 0.8278
  • R&D_2012 is not normally distributed (p-value 1.866487174854266e-24)
  • R&D_2012 has 162 outliers

Country_code

numerical

Approximate Distinct Count 12
Approximate Unique (%) 2.3%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 8.3 KB
Mean 3.7642
Minimum 1
Maximum 11
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • Country_code is skewed right (γ1 = 0.5565)

Quantile Statistics

Minimum 1
5-th Percentile 1
Q1 1
Median 4
Q3 4
95-th Percentile 8
Maximum 11
Range 10
IQR 3

Descriptive Statistics

Mean 3.7642
Standard Deviation 2.358
Variance 5.5603
Sum 2002.5502
Skewness 0.5565
Kurtosis -0.3664
Coefficient of Variation 0.6264
  • Country_code is not normally distributed (p-value 1.5759450561070036e-15)
  • Country_code has 18 outliers
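
The Kurtosis of -0.3664 reported for Country_code is negative, which is only possible for excess kurtosis (fourth standardized moment minus 3, so that a normal distribution scores 0). The sketch below is a minimal implementation of that definition using population moments. Any bias correction the profiling tool may apply is not stated, and the sample list is hypothetical.

```python
def excess_kurtosis(values):
    """Excess kurtosis: m4 / m2**2 - 3, using population moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    return m4 / m2 ** 2 - 3


flat = [1, 2, 3, 4, 5]  # hypothetical, evenly spread values
print(excess_kurtosis(flat))
```

A negative result indicates lighter tails than a normal distribution, while the large positive values reported for the other numerical variables indicate heavy tails.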
